AITopics | content metadata

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs

Belavadi, Vibha, Vatsa, Tushar, Sultania, Dewang, Suresha, Suhas, Verma, Ishita, Chen, Cheng, King, Tracy Holloway, Friedrich, Michael

arXiv.org Artificial Intelligence, May-26-2025

This paper addresses fine-tuning Large Language Models (LLMs) for function calling tasks when real user interaction data is unavailable. In digital content creation tools, where users express their needs through natural language queries that must be mapped to API calls, the lack of real-world task-specific data and privacy constraints for training on it necessitate synthetic data generation. Existing approaches to synthetic data generation fall short in diversity and complexity, failing to replicate real-world data distributions and leading to suboptimal performance after LLM fine-tuning. We present a novel router-based architecture that leverages domain resources like content metadata and structured knowledge graphs, along with text-to-text and vision-to-text language models to generate high-quality synthetic training data. Our architecture's flexible routing mechanism enables synthetic data generation that matches observed real-world distributions, addressing a fundamental limitation of traditional approaches. Evaluation on a comprehensive set of real user queries demonstrates significant improvements in both function classification accuracy and API parameter selection. Models fine-tuned with our synthetic data consistently outperform traditional approaches, establishing new benchmarks for function calling tasks.

large language model, machine learning, natural language

2505.10495

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata

Agrawal, Saurabh, Trenkle, John, Kawale, Jaya

arXiv.org Artificial Intelligence, Sep-15-2023

Content metadata plays a very important role in movie recommender systems, as it provides valuable information about various aspects of a movie such as genre, cast, plot synopsis, and box office summary. Analyzing the metadata can help in understanding user preferences in order to generate personalized recommendations and to address item cold start. In this talk, we focus on one particular type of metadata: genre labels. Genre labels associated with a movie or a TV series help categorize a collection of titles into different themes and correspondingly set up audience expectations. We present some of the challenges associated with using genre label information and propose a new way of examining the genre information that we call the Genre Spectrum. The Genre Spectrum helps capture the various nuanced genres in a title, and our offline and online experiments corroborate the effectiveness of the approach. Furthermore, we also talk about applications of LLMs in augmenting content metadata, which could eventually be used to achieve effective organization of recommendations in a user's 2-D home grid.

genre label, metadata, movie

doi: 10.1145/3604915.3608883

2309.08787

Country:

Asia > Singapore > Central Region > Singapore (0.06)
North America > United States > Minnesota (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.40)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Content metadata: why keyword extraction requires automated labelling -- EDIA

#artificialintelligence, Jan-16-2022, 09:15:25 GMT

Keywords are not a science but an art. There is no such thing as 'the right keyword,' since a keyword is a core concept incorporated into a piece of content in the broadest sense. Texts don't necessarily need to contain the exact keyword. For example, if the term 'European Union' is used several times, 'European Commission' may be a suitable keyword even though the writer never uses the term. Despite this fluid definition, keywords should be understandable to those who try to find the right ones.

content metadata, keyword, keyword extraction require

Country: Europe (0.59)

Industry: Government > Regional Government > Europe Government (0.59)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.43)

How Machine Learning Is Changing the Game for Content Metadata

#artificialintelligence, Jan-8-2018, 06:59:22 GMT

These are the best of times for entertainment content owners and distributors--but they are also very challenging times. There is more content--often great content--than ever before and also vastly more competition due to the rise of streaming services, as well as on-demand options. This presents a challenge for content owners and distributors: how to stand out from the crowd and help viewers find what they want. Awash in all that content--not just professionally produced long-form content, but also highly viral digital-first content--viewers have a hard time wading through it all. In fact, it would take a single viewer more than 5 million years to watch the amount of video that crosses global IP networks each month, according to a recent Cisco Systems report. That's why it's imperative for content owners and distributors to make it easy for viewers to search and discover their content.

artificial intelligence, machine learning, metadata

Industry: Media > Television (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.75)
